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Evolution produces complex and structured 
networks of interacting components in chem- 
ical, biological, and social systems. We de- 
scribe a simple mathematical model for the 
evolution of an idealized chemical system to 
study how a network of cooperative molecu- 
lar species arises and evolves to become more 
complex and structured. The network is mod- 
eled by a directed weighted graph whose pos- 
itive and negative links represent 'catalytic' 
and 'inhibitory' interactions among the molec- 
ular species, and which evolves as the least 
populated species (typically those that go ex- 
tinct) are replaced by new ones. A small auto- 
catalytic set (ACS), appearing by chance, pro- 
vides the seed for the spontaneous growth of 
connectivity and cooperation in the graph. A 
highly structured chemical organization arises 
inevitably as the ACS enlarges and perco- 
lates through the network in a short, ana- 
lytically determined time scale. This self- 
organization does not require the presence of 
self-replicating species. The network also ex- 
hibits catastrophes over long time scales trig- 
gered by the chance elimination of 'keystone' 
species, followed by recoveries. 

Structured networks of interacting components 
are a hallmark of several complex systems, for ex- 
ample, the chemical network of molecular species in 
cells [Q , the web of interdependent biological species 
in ecosystems [|],|| , and social and economic networks 
of interacting agents in societies [Q-Q] . The structure 
of these networks is a product of evolution, shaped 
partly by the environment and physical constraints 
and partly by the population (or other) dynamics in 
the system. For example, imagine a pond on the pre- 
biotic earth containing a set of interacting molecular 
species with some concentrations. The interactions 
among the species in the pond affect how the popu- 



lations evolve with time. If some population goes to 
zero, or if new molecular species enter the pond from 
the environment (through storms, floods or tides), 
the effective chemical network existing in the pond 
changes. We discuss a mathematical model that at- 
tempts to incorporate this interplay between a net- 
work, populations, and the environment in a simple 
and idealized fashion. The model (including an ear- 
lier version was inspired by the ideas and re- 



sults in refs. ||10|-|l8[. Related but different models 
are studied in refs. [19h21|. 



The Model 

The system consists of s species labeled by the index 
i = 1, 2, . . . , s. The network of interactions between 
species is specified by the s x s real matrix C = {c^}. 
The network can be visualised as a directed graph 
whose nodes represent the species. A non-zero Cy is 
represented by a directed weighted link from node j 
to node i. If cy > then the corresponding link is a 
cooperative link: species j catalyzes the production 
of species i. If c-ij < it is a destructive link: the 
presence of j causes a depletion of i. 

Population dynamics. The model contains an- 
other dynamical variable x = (x%, . . . .x s ), where Xi 
stands for the relative population of the i th species 
(0 < Xi < 1, Y^,i=i x i = !)■ The time evolution of 
x depends upon the interaction coefficients C, as is 
usual in population models. The specific evolution 
rule we consider is 



= fi 
= 

where fi 



if Xi > or fi > 0, 
if Xi = and fi < 0, 
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This is a particularly simple idealization of catalyzed 
chemical reaction dynamics in a well stirred reactor, 
as explained later. 



Graph dynamics. The dynamics of C in turn de- 
pends upon x, as follows: Start with a random graph 
of s nodes: Cy is non-zero with probability p and 
zero with probability 1 — p. If nonzero, Cy is chosen 
randomly in the interval [—1,1] for i j and [—1,0] 
for i = j. Thus a link between two distinct species, 
if it exists, is just as likely to be cooperative as de- 
structive, and a link from a species to itself can only 
be inhibitive, i.e., autocatalytic or self-replicating in- 
dividual species are not allowed. The variable x is 
initialized by choosing each Xi randomly between 
and 1, and then rescaling all Xi uniformly such that 
Si=i x i — 1- The evolution of the network proceeds 
in three steps : 

1) Keeping the network fixed, the populations are 
evolved according to ([!]) for a time T which is large 
enough for x to get reasonably close to its attractor. 
We denote X, = x t (T). 

2) The set of nodes i with the least value of Xi is 
determined. We call this the set of 'least fit' nodes, 
identifying the relative population of a species in the 
attractor (or, more specifically, at T) with its 'fit- 
ness' in the environment defined by the graph. One 
of the least fit nodes is chosen randomly (say io) and 
removed from the system along with all its links leav- 
ing a graph of s — 1 species. 

3) A new node is added to the graph so that it again 
has s nodes. The links of the added node (cu and 
Ci a i, for % = 1, . . . , s) are assigned randomly according 
to the same rule as for the nodes in the initial graph. 
The new species is given a small relative population 
%i = and the other populations are rescaled to 
keep J2t=i x i — 1- This process, from step 1 onwards, 
is iterated many times. 

Motivation for model structure. The choice of 
Eq. (jl]) is motivated by the rate equations in a well 
stirred chemical reactor (representing, say, a prebiotic 
pond) as follows: If species j catalyses the ligation of 

reactants A and B to form the species i, A + B i, 
then the rate of growth of the population yi of species 
i will be given by i/i = k(l + vy^)nAnB — 4>Vii where 
riA, tib are reactant concentrations, k is the rate con- 
stant for the spontaneous reaction, v is the catalytic 
efficiency, and <j) represents a common death rate 
or dilution flux in the reactor [Q. Assuming the 
catalysed reaction is much faster than the sponta- 
neous reaction, and the concentrations of the reac- 
tants are large and fixed, the rate equation becomes 
ih = cyj — <fiyi, where c is a constant. If species i 
has multiple catalysts, we get iji = J2j=i c ijVj ~ 4>yi- 



The first of equations ([j]) follows from this upon using 
the definition Xi — J/i/X^=i Vj- When negative links 
are permitted, the second of equations (|I]) is needed 
in general to prevent relative populations from going 
negative. (With negative links, a more realistic chem- 
ical interpretation would be obtained if ii were pro- 
portional to but for simplicity we retain the form 
of ([!]) in this paper.) (|l|) may be viewed as defining 
an artificial chemistry in the spirit of refs. |T^— [l7|l . 

The rules for the evolution of the network C are 
intended to capture two key features of natural evolu- 
tion, namely, selection and novelty. The species that 
has the least population in the attractor configuration 
is the one most likely to be eliminated in a large fluc- 
tuation in a possible hostile environment. Often, the 
least value of Xi is zero. Thus the model implements 
selection by eliminating from the network a species 
that has become extinct or has the least chance of 
survival |TJ| . Novelty is introduced in the network in 
the form of a new species. This species has on average 
the same connectivity as the initial set of species, but 
its actual connections with the existing set are drawn 
randomly. E.g., if a storm brings into a prebiotic 
pond a new molecular species from the environment, 
the new species might be statistically similar to the 
one being eliminated, but its actual catalytic and in- 
hibitory interactions with the surviving species can 
be quite different. Another common feature of natu- 
ral evolution is that populations typically evolve on 
a fast time scale compared to the network. This is 
captured in the model by having the Xi relax to their 
attractor before the network is updated. The ideal- 
ization of a fixed total number of species s is one that 
we hope to relax in future work. 

The model described above differs from the one 
studied in (|, ^| in that it allows negative links and 
varying link strengths, and that the population dy- 
namics, given by (|lj), is no longer linear. The ear- 
lier model had only fixed point attractors; here limit 
cycles are also observed. Since C now has negative 
entries, the formalism of non-negative matrices no 
longer applies. 

Results 

Emergence of cooperation and interdepen- 
dence. Figure 1 shows a sample run. The same qual- 
itative behaviour is seen in each of several hundred 
runs performed for p values ranging from 0.00002 to 
0.01 and for s = 100, 150, 200. The fact that the 
ratio of number of cooperative to destructive links 
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at first remains constant at unity (statistically) and 
then increases by more than an order of magnitude 
is evidence of the emergence of cooperation. The in- 
crease in d by an order of magnitude is a quantitative 
measure of the increase of interdependence of species 
in the network. The increase in the total density of 
links (Z_|_ + l-)/s is another aspect of the increase of 
complexity of the system. Note that in the model 
selection rewards only 'performance' as measured in 
terms of relative population; the rules do not select 
for higher cooperativity per se. Since a new species is 
equally likely to have positive or negative links with 
other species, the introduction of novelty is also not 
biased in favour of cooperativity. The fact that this 
behaviour is not a consequence of any intrinsic bias 
in the model that favours the increase of cooperation 
and interdependence is evidenced by the flat initial 
region of all the curves. 

Autocatalytic sets. The explanation for the above 
behaviour lies in the formation and growth of certain 
structures, autocatalytic sets (ACSs), in the graph. 
An ACS is defined as a set of nodes such that each 
node has at least one incoming positive link from a 
node in the set. Thus an ACS has the property of cat- 
alytic closure, i.e., it contains a catalyst for each of its 
members [f23r-p5| . The simplest example of an ACS is 
a cycle of positive links. Every ACS is not such a cy- 
cle but it can be shown that an ACS must contain a 
cycle of positive links M. In Fig. 1, there is no ACS 
in the graph until n = 1903. A small ACS (which 
happens to be a cycle of positive links between two 
nodes) appears at n = n\ = 1904, exactly where the 
behaviour of the S\ curve changes. As time proceeds 
this ACS becomes more complex and enlarges until 
at n = n-i — 3643 the entire graph becomes an ACS. 
Z+ and d exhibit an increase and a decrease as the 
ACS comes to occupy a significant part of the graph. 
After the ACS first appears (at n = rii), the set of 
populated nodes in the attractor configuration (si in 
number), is always an ACS (except for certain catas- 
trophic events to be discussed later), which we call 
the 'dominant ACS'. The spontaneous appearance of 
a small ACS at some n = n\, its persistence (except 
for catastrophes), and its growth until it spans the 
graph at n — ri2, is seen in each of the several hun- 
dred runs mentioned earlier. The growth of the ACS 
across the graph between m and rii occurs exponen- 
tially (with stochastic fluctuations), 

Sl (n) * s^nje^" 1 ^' , r g = 2/p. (2) 



This agrees with simulations as shown in Figure 2. 
The average time scale r a = (n\) for the first appear- 
ance of the ACS is given, for sufficiently small p, by 
r a ~ 4/(p 2 s) (= 1600 for p = 0.005 and s = 100). 

Upto n = rii, the graph has no ACS. It has chains 
and trees of positive and negative links and possibly 
loops containing negative links. These latter struc- 
tures are not robust. For example, consider a chain 
of two positive links 1 — » 2 — ► 3. Since catalytic links 
are pointing to node 3, it will do well populationally 
compared to nodes 1 and 2. However, since 1 has no 
incoming catalytic links, its relative population will 
decline to zero under (|l]), and it can be picked for 
replacement in the next graph update. This can dis- 
rupt the chain and hence erode the 'well being' of 
node 3 until eventually after some graph updates the 
latter can also join the ranks of the least fit. Species 
3 gets eliminated eventually because it does not feed- 
back into and 'protect' species 1 and 2, on whom its 
'well being' depends. In a graph without an ACS 
no structure is protected from disruption. Since ev- 
ery node is liable to be replaced sooner or later, the 
graph remains as random as the initial graph (we have 
checked that the probability distribution of the num- 
ber of incoming and outgoing links at a node remains 
the appropriate binomial for n < ni). This explains 
why si, l± and d hover around their initial values. 

The picture changes the moment a small ACS ap- 
pears in the graph. The key point is that by virtue 
of catalytic closure, members of the ACS do well 
collectively in the population dynamics governed by 
(jl]). An ACS is a collective self-replicator and beats 
chains, trees and other non ACS structures in the 
population game, reducing their Xi to zero when it 
appears. Thus, since graph update proceeds by re- 
placing one of the nodes with X{ = (if present) 
with a new one, such a replacement being outside the 
dominant ACS can cause no damage to the links that 
constitute the ACS. That is why the ACS structure, 
once it appears, is much more robust than the non- 
ACS structures discussed earlier. If the new node 
happens to get an incoming positive link from the 
dominant ACS, it becomes part of it. Thus the dom- 
inant ACS tends to expand in the graph as new nodes 
get attached to it jl5| H||] and s% increases. In An 
graph updates the average increase in s\, which is 
the number of added nodes which will get a positive 
link from one of the Si nodes of the dominant ACS, 
is Asi w (p/2)s±An, for small p. This proves (@). 
(Note that the exponential growth described by (g) 
is not to be confused with the exponential growth of 
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populations yi of species that are part of the ACS. (J2J) 
reflects the growth of the ACS across the graph, or 
the increase in the number of species that constitute 
the ACS.) 

Since the dominant ACS grows by adding posi- 
tive links from the existing dominant ACS, the num- 
ber of positive links increases as the ACS grows. On 
the other hand nodes receiving negative links usu- 
ally end up being least fit, hence negative links get 
removed when these nodes are eliminated. Which 
novelty is captured thus depends upon the existing 
'context'; the network evolves by preferentially cap- 
turing links and nodes that 'latch on' cooperatively to 
the existing ACS and disregarding those that do not. 
The 'context' itself arises when the ACS structure 
first appears; this event transforms the nature of net- 
work evolution from random to 'purposeful' (in this 
case directed towards increasing cooperation). Be- 
fore the ACS appears nothing interesting happens 
even though selection is operative (the least popu- 
lated species are being eliminated). It is only after 
the ACS topological structure appears that selection 
for cooperation and complexity begins. 

Inevitability of autocatalytic sets. Note that the 
appearance of an ACS, though a chance event, is in- 
evitable. For sp -C 1, the probability that a graph 
not containing a two cycle will acquire one at the 
next time step is p 2 s/4 = q. Since the probability 
of occurrence of 3-cycles, etc., is much smaller, the 
probability distribution of arrival times n\ is approx- 
imated by P{n\) = q(l — q) ni ~ l , whose mean r a is 
1/q. Since this probability declines exponentially af- 
ter a time scale 1/q, the appearance of an ACS is 
inevitable, even for arbitrarily small (but finite) p. 

Occasionally in a graph update s± can decrease 
for various reasons. If the new node forms an ACS 
of its own with nodes outside the dominant ACS, 
and the new ACS has a higher population growth 
rate (as determined by ([!])) than the old ACS it 
drives the species of the latter to extinction and be- 
comes the new dominant ACS. Alternatively the new 
node could be a 'destructive parasite': it receives one 
or more positive links from and gives one or more 
negative links to the dominant ACS. Then part or 
whole of the ACS may join the set of least fit nodes. 
Structures that diminish the size of the dominant 
ACS or destroy it appear rarely. For example in 
Figure 1, destructive parasites appeared 6 times at 
n = 3388,3478,3576,3579,3592 and 3613. In each 
case si decreased by 1. 



Emergence of structure. At n = rii the whole 
graph becomes an ACS; the entire system can col- 
lectively self-replicate despite the explicit absence of 
individual self-replicators. Such a fully autocatalytic 
set is a very non-random structure. Consider a graph 
of s nodes and let the probability of a positive link ex- 
isting between any pair of nodes be p* . Such a graph 
has on average m* = p*(s — 1) incoming or outgoing 
positive links per node. For the entire graph to be an 
ACS, each node must have at least one incoming pos- 
itive link, i.e., each row of the matrix C must contain 
at least one positive element. Hence the probability, 
P, for the entire random graph to be an ACS is 
P = probability that every row has at least 
one positive entry 

= [probability that a row has at least 
one positive entry] 5 

= [1 — (probability that every entry 
of a row is < 0)] s 

= [I - (I - p*)*- 1 }* 

= [1 - (1 -m*/(s- I)) 3 - 1 } 3 . 
For large s and m* ~ 0(1), 

P» (l-e- m *) 3 =e~ as , (3) 

where a is positive and 0(1). At n = 712, we find in 
all our runs that Z+(ri2) = I* is greater than s but 
of order s, i.e., m* ~ O(l). Thus dynamical evolu- 
tion in the model via the ACS mechanism converts 
a random organization into a highly structured one 
that is exponentially unlikely to appear by chance. 
In the displayed run at n = ri2 the graph had 117 
positive links. The probability that a random graph 
with s = 100 nodes and m* = 1.17 would be an ACS 
is given by (g) to be w 10 -16 . 

Such a structure would take an exponentially long 
time to arise by pure chance. The reason it arises in- 
evitably in a short time scale in the present model 
is the following: a small ACS can appear by chance 
quite readily, and once appeared, it grows exponen- 
tially fast across the graph by the mechanism out- 
lined earlier. The dynamical appearance of such a 
structure may be regarded as the emergence of 'orga- 
nizational order'. The appearance of 'exponentially 
unlikely' structures in the prebiotic context has been 
a puzzle. The fact that in the present model such 
structures inevitably form in a short time may be 
relevant for the origin of life problem. 

The self-organization time scale in a prebiotic 
scenario. We now speculate on a possible applica- 
tion to prebiotic chemical evolution. Imagine the 
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molecular species to be small peptide chains with 
weak catalytic activity in a prebiotic pond alluded 
to earlier. The pond periodically receives an influx of 
new molecular species being randomly generated else- 
where in the environment, through tides, storms or 
floods. Between these influxes of novelty the pond be- 
haves as a well stirred reactor where the populations 
of existent molecular species evolve according to (a 
realistic version of) eq. ([[]) and reach their attractor 
configuration. Under the assumption that the present 
model captures what happens in such a pond, the 
growth timescale (J2J) for a highly structured almost 
fully autocatalytic chemical organization in the pond 
is T g = 2/p in units of the graph update time step. 
In this scenario, the latter time unit corresponds to 
the periodicity of the influx of new molecular species, 
hence it ranges from one day (for tides) to one year 
(for floods). Further, in the present model p/2 is the 
probability that a random small peptide will catalyse 
the production of another |^5| , and this has been esti- 
mated in g§: p/2 ~ l(T 5 -l(r 10 . With p/2 ~ 1CT 8 , 
for example, the time scale for a highly structured 
chemical organization to grow in the pond would be 
estimated to be of the order of 10 6 to 10 8 years. It is 
believed that life originated on Earth in a few hun- 
dred million years after the oceans condensed. These 
considerations suggest that it might be worthwhile to 
empirically pin down the 'catalytic probability' p (in- 
troduced in ^5|) for peptides, catalytic RNA, lipids, 
etc., on the one hand, and explore chemically more 
realistic models on the other. 

Catastrophes and recoveries in the network 
dynamics. After n = ri2 the character of the net- 
work evolution changes again. For the first time the 
least fit node will be one of the ACS members. Most 
of the time elimination of the node does not affect 
the ACS significantly and si fluctuates between s 
and s — 1. Sometimes the least fit node could be a 
'keystone' species which plays an important organiza- 
tional role in the network despite its low population. 
When such a node is eliminated many other nodes 
can get disconnected from the ACS resulting in large 
dips in si and d and subsequently large fluctuations 
in l + and Z_. These large 'extinction events' can be 
seen in Figure 3. Occasionally the ACS can even be 
destroyed completely. The system recovers on the 
time scale r g after large extinctions if the ACS is not 
completely destroyed; if it is and the next few updates 
obliterate the memory of previous structures in the 
graph, then again a time on average r a elapses before 



an an ACS arises and the self-organization process 
begins anew. It may be of interest (especially in ecol- 
ogy, economics, and finance), that network dynam- 
ics based on a fitness selection and the 'incremental' 
introduction of novelty, as discussed here, can by it- 
self cause catastrophic events without the presence of 
large external perturbations. 

Discussion 

We have described an evolutionary model in which 
the dynamics of species' populations (fast variables) 
and the graph of interactions among them (slow vari- 
ables) are mutually coupled. The network dynamics 
displays self-organization seeded by the chance but 
inevitable appearance of a small cooperative struc- 
ture, namely, an ACS. In a dynamics that penalizes 
species for low population performance, the collective 
cooperativity of the ACS members makes the set rel- 
atively robust against disruption. New species that 
happen to latch on cooperatively to this structure 
preferentially survive, further enlarging the ACS in 
the process. Eventually the graph acquires a highly 
non-random structure. We have discussed the time 
evolution of quantitative measures of cooperation, in- 
terdependence and structure of the network, which 
capture various aspects of the complexity of the sys- 
tem. 

It is noteworthy that collectively replicating ACSs 
arise even though individual species are not self- 
replicating. Thus the present mechanism is different 
from the hypercycle [p6| , where a template is needed 
to produce copies of existing species. Unlike the hy- 
percycle, the ACS is not disrupted by parasites and 
short-circuits and grows in complexity, as evidenced 
in all our runs. It can be disrupted, however, when 
it loses a 'keystone' species. 

It is also worth mentioning one departure from 
|l2j , in that we find that a fully autocatalytic system 
(or percolating ACS) is not needed apriori for self- 
organization. In the present model a small ACS, once 
formed, typically expands (see also |l5|]) and eventu- 
ally percolates the whole network dynamically. This 
dynamical process might be relevant for economic 
takeoff and technological growth in societies. 
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f Entire graph becomes 
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at n=1904 
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Figure 1. A run with parameter values s — 100, p — 0.005, and xo — 10 -4 . 

(a) Number of populated species, si, in the attractor of (0) (i.e., number of nodes with Xi > 0) after the 
n th addition of a new species (i.e., after n graph update time steps), (b) The number, l+, of positive links 
(cij > 0) in the graph (blue); the number, of negative links (green); and 'interdependency', d, of the 
species in the network (red). Interdependency is defined as d = (1/s) J2t=i di, where di is the 'dependency' 
of the i th node, di = J2 k - \ckj \h l k , where h l k is 1 if there exists a directed path from k to i and otherwise, di 
is the sum of (the absolute value of) the strengths of all links that eventually feed into i along some directed 
path. The curves have three distinct regions. Initially s± is small; most of the species have zero relative 
populations. 1+ and i_ also do not vary much from their initial (random graph) value (« ps 2 /2 = 25) and 
remain approximately equal, d hovers about its initial low value. In the second region si, 1+ and d show a 
sharp increase and I- decreases. In the third region si, l± and d level off (but with fluctuations) and almost 
all species have non-zero populations in contrast to the initial period. 
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Figure 2. Power law dependence of r g on p. Each data point shows the average of r g over 5 different runs 
with s = 100 and the given p value. The error bars correspond to one standard deviation. The best fit 
line has slope —1.02 ± 0.03 and intercept —0.08 ± 0.26 which is consistent with the expected slope —1 and 
intercept 0. 
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